Received: from iesd.auc.dk by optima.cs.arizona.edu (5.65c/15) via SMTP
id AA27274; Tue, 16 Mar 1993 15:39:32 MST
Received: from yellow.iesd.auc.dk by iesd.auc.dk with SMTP id AA07007
(5.65c8/IDA-1.5/MD for <tsql@cs.arizona.edu>); Tue, 16 Mar 1993 23:39:48 +0100
Date: Tue, 16 Mar 1993 23:39:48 +0100
From: "Christian S. Jensen" <csj@iesd.auc.dk>
Message-Id: <199303162239.AA07007@iesd.auc.dk>
To: tsql@cs.arizona.edu
Subject: Re: Benchmark initiative
Jim, Al, and Alex,
I hope you all survived the worst snow storm in 100 years :-)
Thanks for your interest in the TSQL Benchmark. I have read your
posting to the tsql list with great interest. I agree with your
observations, and I thank you for taking the time to communicate them.
Let me now attempt to address your concerns within the context of the
benchmark effort. Some of your concerns can be met now, and the rest
may need to await a later edition of the benchmark.
>From info-tsql-sender@cs.arizona.edu Tue Mar 16 20:42:45 1993
>
>We would like to make two comments on the proposal to develop a
>comprehensive set of natural language queries as a test of "goodness"
>of various query languages and algebras.
In the initial proposal for a database schema for the benchmark (sent
to tsql a few days ago), I included this characterization (based on
Rick's postings to the tsql list).
"The central goal of this document is to provide the temporal database
community with a {\em comprehensive consensus benchmark} for temporal
query languages which is {\em independent} of any existing language
proposal.
This is not a performance benchmark, but a {\em semantic} benchmark
which is intended to be an aid in evaluating the user-friendliness of
proposals for temporal query languages. Thus, temporal query languages
should ideally be able to express the benchmark queries both
conveniently and naturally."
To me, the key word is user-friendliness. The benchmark is intended to
be a valuable tool for designers of user-level query languages in
general and of a TSQL language, in particular. The benchmark is not
intended to cover algebras which are not user-level languages. Using
the benchmark, a designer may become aware of boolean seams (Shashi
has shown the importance of avoiding these), of violations of the
0-1-infinity principle, of lack of orthogonality, of other types of
inconsistencies (see the PL literature), etc.
>First, we feel that a certain classification of various queries has to
>be established before the set of queries is proposed. As an *initial*
>suggestion, we can classify temporal queries as follows.
>
>                 | HISTORICAL | ROLLBACK | BITEMPORAL
>--------------------------------------------------------
>UNGROUPED        |            |          |
>                 |            |          |
>GROUPED          |            |          |
>                 |            |          |
>TEMP. AGGREGATES |            |          |
>                 |            |          |
>SCHEMA VERSIONING|            |          |
>                 |            |          |
>OTHER FEATURES   |            |          |
>--------------------------------------------------------
>
>Then we can place one or several queries in each cell of this matrix.
I think you are exactly right that a classification (or taxonomy) is
needed before queries are proposed. That was what I tried to
indicate in the introductory statement of the schema proposal and in
the plan that accompanied the schema proposal.
"% The purpose of the following draft is then to define a taxonomy to
% be used for categorizing the benchmark queries that will follow."
"Three tasks must be accomplished initially.
Task 1: Agree on a database schema.
Task 2: Agree on an instance of the schema.
Task 3: Agree on a suitable taxonomy for the benchmark queries.
These tasks will be addressed sequentially during the next weeks. When
they are completed, the benchmark will be populated with queries."
I think your taxonomy is very helpful. It is very much a top-down
taxonomy that makes it possible to classify a very wide variety of
queries.
To keep the current version of the benchmark from becoming too big
(I would not like it to exceed, say, 100 queries), Rick suggested
a restricted scope. Thus, only valid time is addressed, and aggregates
as well as schema versioning are presently not addressed. As a
result, most queries in the initial benchmark will fall into only a
couple of categories, making additional refinement desirable.
I will ask for help from you, Shashi, Ed, Patrick, Fabio, Maria,
Paolo, and Abdullah (among others) when Task 3 is addressed. Grouping
will certainly be an important aspect of this taxonomy and of the
queries themselves, and I hope that you will be able to help ensure
that it is present.
>The second comment is that the development of the benchmark should not
>be a substitute for a rigorous *theoretical* study of expressive
>powers of various temporal query languages and algebras. It is not
>entirely clear what the goal of the benchmark is. What seems
>necessary is a kind of typology of temporal data models as suggested
>by the above table and discussion. For example, one data model can be
>grouped bitemporal, another be ungrouped historical with aggregates.
I will emphasize in the introduction of the schema proposal document
that the benchmark is not intended to be such a substitute.
The current formulation is:
"While the benchmark is not intended to constitute a metric for query
language completeness, ..."
I will change the formulation to this:
"The benchmark is {\em not} intended to constitute a metric for query
language completeness, and as such it is not a substitute for a
rigorous {\em theoretical} study of expressive powers of various
temporal query languages."
Making a stronger and more explicit point of this is very
appropriate. I had hoped that the introduction explained the goal
of the benchmark. I will gratefully accept additional
clarifications, to be added to the document.
>Perhaps, the benchmark can be useful in developing such a typology.
>Each type of data model should support a class of queries the model
>embodies and should have its own standard of completeness. We believe
>that this standard should be developed in the terms of an appropriate
>logic (as in the classical relational case) rather than trying to
>determine expressive power "by consensus" (we would not want to say
>that one language is more expressive than another if it can express
>95% of the benchmark queries and the other one only 87%).
This is again an important point! I have seen papers argue that one
algebra is better than another algebra because the former satisfies
more criteria (from the Comp Surveys paper by Rick and Ed M) than the
latter. That is very unfortunate. Similarly, we must try to avoid this
use of the benchmark. For now, I'll add the following text to the
introduction. This text may have to be refined later on.
"It is emphasized that using the benchmark as an advanced,
quantitative scoring system for comparing languages makes little
sense. Thus, one language is not necessarily superior to another just
because one is capable of expressing more benchmark queries than the
other. Rather, the focus is on user-friendliness."
Best regards,
Christian